A Split-merge Markov Chain Monte Carlo Procedure for the Dirichlet Process Mixture Model

نویسندگان

  • Sonia Jain
  • Radford M. Neal
چکیده

We propose a split-merge Markov chain algorithm to address the problem of inee-cient sampling for conjugate Dirichlet process mixture models. Traditional Markov chain Monte Carlo methods for Bayesian mixture models, such as Gibbs sampling, can become trapped in isolated modes corresponding to an inappropriate clustering of data points. This article describes a Metropolis-Hastings procedure that can escape such local modes by splitting or merging mixture components. Our Metropolis-Hastings algorithm employs a new technique in which an appropriate proposal for splitting or merging components is obtained by using a restricted Gibbs sampling scan. We demonstrate empirically that our method outperforms the Gibbs sampler in situations where two or more components are similar in structure.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Splitting and Merging Components of a Nonconjugate Dirichlet Process Mixture Model

Abstract. The inferential problem of associating data to mixture components is difficult when components are nearby or overlapping. We introduce a new split-merge Markov chain Monte Carlo technique that efficiently classifies observations by splitting and merging mixture components of a nonconjugate Dirichlet process mixture model. Our method, which is a Metropolis-Hastings procedure with split...

متن کامل

A Split-Merge MCMC Algorithm for the Hierarchical Dirichlet Process

Abstract The hierarchical Dirichlet process (HDP) has become an important Bayesian nonparametric model for grouped data, such as document collections. The HDP is used to construct a flexible mixed-membership model where the number of components is determined by the data. As for most Bayesian nonparametric models, exact posterior inference is intractable—practitioners use Markov chain Monte Carl...

متن کامل

Variable selection in clustering via Dirichlet process mixture models

The increased collection of high-dimensional data in various fields has raised a strong interest in clustering algorithms and variable selection procedures. In this paper, we propose a model-based method that addresses the two problems simultaneously. We introduce a latent binary vector to identify discriminating variables and use Dirichlet process mixture models to define the cluster structure...

متن کامل

An Improved Merge-split Sampler for Conjugate Dirichlet Process Mixture Models

The Gibbs sampler is the standard Markov chain Monte Carlo sampler for drawing samples from the posterior distribution of conjugate Dirichlet process mixture models. Researchers have noticed the Gibbs sampler’s tendency to get stuck in local modes and, thus, poorly explore the posterior distribution. Jain and Neal (2004) proposed a merge-split sampler in which a naive random split is sweetened ...

متن کامل

Sequentially-Allocated Merge-Split Sampler for Conjugate and Nonconjugate Dirichlet Process Mixture Models

This paper proposes a new efficient merge-split sampler for both conjugate and nonconjugate Dirichlet process mixture (DPM) models. These Bayesian nonparametric models are usually fit usingMarkov chain Monte Carlo (MCMC) or sequential importance sampling (SIS). The latest generation of Gibbs and Gibbs-like samplers for both conjugate and nonconjugate DPM models effectively update the model para...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2000